Holdout cross-validation for large non-Gaussian covariance matrix estimation using Weingarten calculus

Lamrani, Lamia, Collins, Benoît, Bouchaud, Jean-Philippe

arXiv.org Machine Learning

Cross-validation is one of the most widely used methods for model selection and evaluation; its efficiency for large covariance matrix estimation appears robust in practice, but little is known about the theoretical behavior of its error. In this paper, we derive the expected Frobenius error of the holdout method, a particular cross-validation procedure that involves a single train-test split, for a generic rotationally invariant multiplicative noise model, thereby extending previous results to non-Gaussian data distributions. Our approach uses the Weingarten calculus and the Ledoit-Péché formula to derive the oracle eigenvalues in the high-dimensional limit. When the population covariance matrix follows an inverse Wishart distribution, we approximate the expected holdout error, first with a linear shrinkage and then with a quadratic shrinkage approximating the oracle eigenvalues. Under the linear approximation, we find that the optimal train-test split ratio is proportional to the square root of the matrix dimension. We then run Monte Carlo simulations of the holdout error for different distributions of the norm of the noise, such as the Gaussian, Student, and Laplace distributions, and observe that the quadratic approximation yields a substantial improvement, especially around the optimal train-test split ratio. We also observe that a higher fourth moment of the Euclidean norm of the noise vector sharpens the holdout error curve near the optimal split and lowers the ideal train-test ratio, making the choice of this ratio more important when performing the holdout method.
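The holdout procedure at the heart of this abstract can be sketched in a few lines: split the data once, estimate the covariance on the train block, and score it against the test block with the squared Frobenius norm. This is a minimal illustration on synthetic Gaussian data (the sizes and split ratio are arbitrary choices), not the paper's Weingarten-calculus analysis or its shrinkage approximations:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 400, 50                      # samples and dimension (illustrative sizes)
X = rng.standard_normal((n, p))     # stand-in for the observed data matrix

def holdout_frobenius_error(X, train_frac):
    """Single train/test split: estimate the covariance on the train block
    and score it against the sample covariance of the test block."""
    n = X.shape[0]
    n_train = int(train_frac * n)
    C_train = X[:n_train].T @ X[:n_train] / n_train
    C_test = X[n_train:].T @ X[n_train:] / (n - n_train)
    return np.linalg.norm(C_train - C_test, ord="fro") ** 2

err = holdout_frobenius_error(X, train_frac=0.8)
```

In practice one would scan `train_frac` over a grid; the paper's linear-approximation result says the optimal split ratio grows like the square root of the dimension.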


A Warm-up Example: Sample Covariance and Mar Equation

Neural Information Processing Systems

[Garbled extract: fragments of the proofs of Theorems 2 and 3 via intermediate lemmas, plus figure captions noting results averaged over 50 and 30 runs.]


Autonomous Robotic Radio Source Localization via a Novel Gaussian Mixture Filtering Approach

Kim, Sukkeun, Moon, Sangwoo, Petrunin, Ivan, Shin, Hyo-Sang, Khattak, Shehryar

arXiv.org Artificial Intelligence

This study proposes a new Gaussian Mixture Filter (GMF) to improve estimation performance for the autonomous robotic radio signal source search and localization problem in unknown environments. The proposed filter is first tested on a benchmark numerical problem to validate its performance against other state-of-practice approaches such as Particle Gaussian Mixture (PGM) filters and the Particle Filter (PF). The proposed approach is then tested against PF and PGM filters in real-world robotic field experiments to validate its impact for real-world robotic applications. The considered real-world scenarios have partial observability, with range-only measurements and uncertainty in the measurement model. The results show that the proposed filter handles this partial observability effectively, improving on PF while reducing computational requirements and demonstrating improved robustness over the compared techniques.


Covariance shrinkage for autocorrelated data

Daniel Bartz, Klaus-Robert Müller

Neural Information Processing Systems

The accurate estimation of covariance matrices is essential for many signal processing and machine learning algorithms. In high-dimensional settings the sample covariance is known to perform poorly, hence regularization strategies such as the analytic shrinkage of Ledoit and Wolf are applied.
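A minimal sketch of linear shrinkage toward a scaled identity, the kind of regularization the abstract refers to. The shrinkage intensity `alpha` is fixed by hand here, whereas the Ledoit/Wolf approach (and this paper's extension to autocorrelated data) derives it analytically from the data:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 100, 80                       # high-dimensional regime: p close to n
X = rng.standard_normal((n, p))
S = X.T @ X / n                      # sample covariance

alpha = 0.3                          # illustrative intensity; Ledoit/Wolf
                                     # compute a data-driven analytic choice
mu = np.trace(S) / p                 # scale of the identity target
S_shrunk = (1 - alpha) * S + alpha * mu * np.eye(p)
```

Shrinkage raises the underestimated small eigenvalues of `S` and lowers the overestimated large ones, which is exactly where the sample covariance fails in high dimension.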


Learning convolution filters for inverse covariance estimation of neural network connectivity

George Mohler

Neural Information Processing Systems

We consider the problem of inferring direct neural network connections from Calcium imaging time series. Inverse covariance estimation has proven to be a fast and accurate method for learning macro- and micro-scale network connectivity in the brain, and in a recent Kaggle Connectomics competition inverse covariance was the main component of several top-ten solutions, including our own and the winning team's algorithm. However, the accuracy of inverse covariance estimation is highly sensitive to signal preprocessing of the Calcium fluorescence time series. Furthermore, brute-force optimization methods such as grid search and coordinate ascent over signal processing parameters are time-intensive: learning may take several days, and parameters that optimize one network may not generalize to networks of different size and parameters. In this paper we show how inverse covariance estimation can be dramatically improved using a simple convolution filter applied prior to the sample covariance. Furthermore, these signal processing parameters can be learned quickly using a supervised optimization algorithm. In particular, we maximize a binomial log-likelihood loss function with respect to a convolution filter of the time series and the inverse covariance regularization parameter. Our proposed algorithm is relatively fast on networks the size of those in the competition (1000 neurons), producing AUC scores with accuracy similar to the winning solution in a training time under 2 hours on a CPU. Prediction on new networks of the same size is carried out in less than 15 minutes, the time it takes to read in the data and write out the solution.
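The core pipeline — convolve each trace with a short filter, then form the sample covariance and a regularized inverse — can be illustrated as follows. The kernel values and ridge parameter are hypothetical placeholders; in the paper both the filter and the regularization parameter are learned by maximizing a binomial log-likelihood:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 2000, 20                      # time steps and neurons (toy sizes)
F = rng.standard_normal((T, N))      # stand-in for fluorescence traces

# Hypothetical filter: a short kernel emphasizing fast transients.
kernel = np.array([1.0, -0.6, -0.3])

# Convolve each trace (column) with the filter before the sample covariance.
G = np.apply_along_axis(lambda x: np.convolve(x, kernel, mode="same"), 0, F)
S = np.cov(G, rowvar=False)

lam = 0.1                            # illustrative ridge regularization
P = np.linalg.inv(S + lam * np.eye(N))   # regularized inverse covariance
```

Off-diagonal entries of `P` (suitably thresholded or ranked) then serve as connectivity scores between neuron pairs.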


Asymptotic spectrum of weighted sample covariance: a Marcenko-Pastur generalization

Oriol, Benoit

arXiv.org Machine Learning

We propose an extension of the high-dimensional spectrum analysis of the sample covariance to the setting of the weighted sample covariance. We derive an asymptotic equation characterizing the limiting density of the weighted sample eigenvalues, generalizing the Marcenko-Pastur theorem to weighted sample covariance matrices.
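For concreteness, the weighted sample covariance studied here is $\sum_i w_i x_i x_i^\top$ for observation weights $w_i$. A minimal numpy sketch with arbitrary illustrative weights:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 500, 10
X = rng.standard_normal((n, p))

w = rng.uniform(0.5, 1.5, size=n)    # illustrative observation weights
w = w / w.sum()                      # normalize to sum to one

# Weighted sample covariance: S_w = sum_i w_i x_i x_i^T
S_w = (X * w[:, None]).T @ X
```

Uniform weights `w_i = 1/n` recover the ordinary sample covariance, whose eigenvalue density the classical Marcenko-Pastur theorem describes.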


WeSpeR: Population spectrum retrieval and spectral density estimation of weighted sample covariance

Oriol, Benoit

arXiv.org Machine Learning

The spectrum of the weighted sample covariance shows an asymptotically non-random behavior when the dimension grows with the number of samples. In this setting, we prove that the asymptotic spectral distribution $F$ of the weighted sample covariance has a continuous density on $\mathbb{R}^*$. We then address the practical problem of numerically finding this density. We propose a procedure to compute it, to determine the support of $F$, and to define an efficient grid on it. We use this procedure to design the $\textit{WeSpeR}$ algorithm, which estimates the spectral density and retrieves the true covariance spectrum. Empirical tests confirm the good properties of the $\textit{WeSpeR}$ algorithm.


Asymptotic non-linear shrinkage formulas for weighted sample covariance

Oriol, Benoit

arXiv.org Machine Learning

We compute asymptotic non-linear shrinkage formulas for covariance and precision matrix estimators for weighted sample covariances, in the spirit of Ledoit and P\'ech\'e. We detail the formulas explicitly for exponentially-weighted sample covariances. These new tools pave the way for applying non-linear shrinkage methods to weighted sample covariance. We show experimentally the performance of the asymptotic shrinkage formulas. Finally, we test the robustness of the theory to heavy-tailed distributions.
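A sketch of the setting: an exponentially-weighted sample covariance followed by a rotation-equivariant estimator that keeps the sample eigenvectors and replaces the eigenvalues. The decay rate and the eigenvalue map below are illustrative placeholders, not the paper's asymptotically optimal formulas:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 300, 30
X = rng.standard_normal((n, p))

# Exponential weights with decay rate beta (illustrative value),
# most recent observation weighted highest.
beta = 0.99
w = beta ** np.arange(n)[::-1]
w = w / w.sum()
S_w = (X * w[:, None]).T @ X         # exponentially-weighted sample covariance

# Rotation-equivariant estimator: keep the sample eigenvectors, transform
# the eigenvalues. A simple shrink toward the mean stands in for the
# asymptotically optimal non-linear map derived in the paper.
eigval, eigvec = np.linalg.eigh(S_w)
shrunk = 0.5 * eigval + 0.5 * eigval.mean()
S_nl = eigvec @ np.diag(shrunk) @ eigvec.T
```

Any eigenvalue map of this form preserves the trace when it averages toward the mean, which is one sanity check on a shrinkage implementation.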


Analysis of a multi-target linear shrinkage covariance estimator

Oriol, Benoit

arXiv.org Machine Learning

Multi-target linear shrinkage is an extension of the standard single-target linear shrinkage for covariance estimation. We combine several constant matrices - the targets - with the sample covariance matrix. We derive the oracle and a \textit{bona fide} multi-target linear shrinkage estimator with exact and empirical mean. In both settings, we prove its convergence towards the oracle under Kolmogorov asymptotics. Finally, we show empirically that it outperforms other standard estimators in various situations.
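The estimator's form is a convex combination of the sample covariance with several fixed targets. A minimal sketch with two common targets and hand-picked weights (the paper derives oracle and bona fide data-driven weights instead):

```python
import numpy as np

rng = np.random.default_rng(0)
n, p = 200, 50
X = rng.standard_normal((n, p))
S = X.T @ X / n                       # sample covariance

# Two illustrative constant targets: the identity and a
# constant-correlation-style matrix.
T1 = np.eye(p)
T2 = np.full((p, p), 0.1) + 0.9 * np.eye(p)

# Convex combination of the sample covariance and the targets;
# fixed weights stand in for the oracle weights derived in the paper.
weights = np.array([0.6, 0.3, 0.1])   # for S, T1, T2
S_mt = weights[0] * S + weights[1] * T1 + weights[2] * T2
```

With a single target this reduces to the standard Ledoit/Wolf-style linear shrinkage.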


It is all in the noise: Efficient multi-task Gaussian process inference with structured residuals

Christoph Lippert (Machine Learning and Computational Biology Research Group, Microsoft Research)

Neural Information Processing Systems

Multi-task prediction methods are widely used to couple regressors or classification models by sharing information across related tasks. We propose a multi-task Gaussian process approach for modeling both the relatedness between regressors and the task correlations in the residuals, in order to more accurately identify true sharing between regressors. The resulting Gaussian model has a covariance term in the form of a sum of Kronecker products, for which efficient parameter inference and out-of-sample prediction are feasible. On both synthetic examples and applications to phenotype prediction in genetics, we find substantial benefits of modeling structured noise compared to established alternatives.
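The covariance structure described here — a sum of Kronecker products combining a task-relatedness term with a structured residual term — can be assembled directly. A toy sketch with random positive semi-definite factors (all sizes and factors are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 3, 4                            # tasks and samples (tiny toy sizes)

def random_psd(d, rng):
    """Random positive semi-definite matrix of size d x d."""
    A = rng.standard_normal((d, d))
    return A @ A.T / d

C_task = random_psd(T, rng)            # task relatedness (signal)
R_sample = random_psd(N, rng)          # sample-side covariance (signal)
C_noise = random_psd(T, rng)           # task correlations in the residuals
E_noise = np.eye(N)                    # i.i.d. noise across samples

# Full covariance as a sum of Kronecker products, as in the model above.
K = np.kron(C_task, R_sample) + np.kron(C_noise, E_noise)
```

The Kronecker-sum structure is what makes inference tractable: the full `T*N x T*N` matrix never needs to be factored directly once the small factors are diagonalized jointly.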